33 research outputs found

    Disjoint pattern matching and implication in strings

    Get PDF
    We deal with the problem of whether a set of string patterns implies the presence of a fixed pattern. While checking whether a set of patterns occurs in a string is solvable in polynomial time, this implication problem is well-known to be intractable. Here we consider a version of the problem when patterns in the set are required to be disjoint. We show that for such a version of the problem the situation is reversed: checking whether a set of patterns occurs in a string is NP-complete, but the implication problem is solvable in polynomial time. 1 Introduction and the main result The problem we consider in this note was motivated by answering queries in incompletely specified XML documents. Suppose that L is a set of letters, or labels, assumed to be countably infinite, and that is a special symbol (wildcard) not in L. By L we denote LâˆȘ {}. A pattern is a finite string over L. If a string s over L matches a pattern π, we write s | = π. More precisely, if s = a0...an−1 an

    Representing and Querying Incomplete Information: a Data Interoperability Perspective

    Get PDF
    This habilitation thesis presents some of my most recent work, which has been done in collaboration with several other people. In particular this thesis concentrates on our contributions to the study of incomplete information in the context of data interoperability. In this scenario data is heterogenous and decentralized, needs to be integrated from several sources and exchanged between different applications. Incompleteness, i.e. the presence of “missing” or “unknown” portions of data, is naturally generated in data exchange and integration, due to data heterogeneity. The management of incomplete information poses new challenges in this context.The focus of our study is the development of models of incomplete information suitable to data interoperability tasks, and the study of techniques for efficiently querying several forms of incompleteness

    When is naive evaluation possible?

    Get PDF
    HAL is a multi-disciplinary open access archive for the deposit and dissemination of sci-entific research documents, whether they are pub-lished or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers. L’archive ouverte pluridisciplinaire HAL, est destinée au dépôt et a ̀ la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d’enseignement et de recherche français ou étrangers, des laboratoires publics ou privés

    Querying Incomplete Data : Complexity and Tractability via Datalog and First-Order Rewritings

    Full text link
    To answer database queries over incomplete data the gold standard is finding certain answers: those that are true regardless of how incomplete data is interpreted. Such answers can be found efficiently for conjunctive queries and their unions, even in the presence of constraints. With negation added, the problem becomes intractable however. We concentrate on the complexity of certain answers under constraints, and on effficiently answering queries outside the usual classes of (unions) of conjunctive queries by means of rewriting as Datalog and first-order queries. We first notice that there are three different ways in which query answering can be cast as a decision problem. We complete the existing picture and provide precise complexity bounds on all versions of the decision problem, for certain and best answers. We then study a well-behaved class of queries that extends unions of conjunctive queries with a mild form of negation. We show that for them, certain answers can be expressed in Datalog with negation, even in the presence of functional dependencies, thus making them tractable in data complexity. We show that in general Datalog cannot be replaced by first-order logic, but without constraints such a rewriting can be done in first-order. The paper is under consideration in Theory and Practice of Logic Programming (TPLP).Comment: Under consideration in Theory and Practice of Logic Programming (TPLP

    A Simple Algorithm for Consistent Query Answering under Primary Keys

    Full text link
    We consider the dichotomy conjecture for consistent query answering under primary key constraints stating that for every fixed Boolean conjunctive query q, testing whether it is certain over all repairs of a given inconsistent database is either polynomial time or coNP-complete. This conjecture has been verified for self-join-free and path queries. We propose a simple inflationary fixpoint algorithm for consistent query answering which, for a given database, naively computes a set Δ\Delta of subsets of database repairs with at most kk facts, where kk is the size of the query qq. The algorithm runs in polynomial time and can be formally defined as: 1. Initialize Δ\Delta with all sets SS of at most kk facts such that SS satisfies qq. 2. Add any set SS of at most kk facts to Δ\Delta if there exists a block BB (ie, a maximal set of facts sharing the same key) such that for every fact aa of BB there is a set Sâ€Č∈ΔS' \in \Delta contained in (SâˆȘ{a})(S \cup \{a\}). The algorithm answers "qq is certain" iff Δ\Delta eventually contains the empty set. The algorithm correctly computes certain answers when the query qq falls in the polynomial time cases for self-join-free queries and path queries. For arbitrary queries, the algorithm is an under-approximation: The query is guaranteed to be certain if the algorithm claims so. However, there are polynomial time certain queries (with self-joins) which are not identified as such by the algorithm

    Reasoning about XML with temporal logics and automata

    Get PDF
    We show that problems arising in static analysis of XML specifications and transformations can be dealt with using techniques similar to those developed for static analysis of programs. Many properties of interest in the XML context are related to navigation, and can be formulated in temporal logics for trees. We choose a logic that admits a simple single-exponential translation into unranked tree automata, in the spirit of the classical LTL-to-BĂŒchi automata translation. Automata arising from this translation have a number of additional properties; in particular, they are convenient for reasoning about unary node-selecting queries, which are important in the XML context. We give two applications of such reasoning: one deals with a classical XML problem of reasoning about navigation in the presence of schemas, and the other relates to verifying security properties of XML views

    Reasoning About Pattern-Based XML Queries

    Get PDF
    Abstract. We survey results about static analysis of pattern-based queries over XML documents. These queries are analogs of conjunctive queries, their unions and Boolean combinations, in which tree patterns play the role of atomic formulae. As in the relational case, they can be viewed as both queries and incomplete documents, and thus static analysis problems can also be viewed as finding certain answers of queries over such documents. We look at satisfiability of patterns under schemas, containment of queries for various features of XML used in queries, finding certain answers, and applications of pattern-based queries in reasoning about schema mappings for data exchange.

    Datalog Rewritings of Regular Path Queries using Views

    Get PDF
    We consider query answering using views on graph databases, i.e. databases structured as edge-labeled graphs. We mainly consider views and queries specified by Regular Path Queries (RPQ). These are queries selecting pairs of nodes in a graph database that are connected via a path whose sequence of edge labels belongs to some regular language. We say that a view V determines a query Q if for all graph databases D, the view image V(D) always contains enough information to answer Q on D. In other words, there is a well defined function from V(D) to Q(D). Our main result shows that when this function is monotone, there exists a rewriting of Q as a Datalog query over the view instance V(D). In particular the rewriting query can be evaluated in time polynomial in the size of V(D). Moreover this implies that it is decidable whether an RPQ query can be rewritten in Datalog using RPQ views
    corecore